A Systemic Approach to Political Homophily

Exploring Predictors of County-level Political Homophily through Geographically Weighted Regression

This is an abridged version of the results. For more details of the LASSO models and interactive maps showing how coefficients for different variables vary spatially, please download Full_Results.html here and open it with your browser.

Are people more likely to connect online with others who share their political opinions? Do people unfriend those with an opposing political orientation? The relationship between political attitudes and social networks has been extensively examined in the literature of political science, sociology, and communication, mostly at the individual level. Scholars tend to approach political homophily as an outcome of individual psychological/behavioral inclinations. However, social networks are reflections of our lifeworlds, and politics are always grounded in local contexts. Local socio-political environments can influence people's online connections, but such ecosystem variables are less explored in existing research.

Focusing on county-level social media connection allows us to bring in contextual variables to test factors that are positively/negatively associated with cross-cutting social connections. Geographically weighted regression further allows us to explore the variation of these relationships across space, achieving a nuanced depiction of factors influencing cross-cutting connections in counties all over the U.S.

Variables

Dependent Variables¶

Cross-cutting connection rate on Facebook: a county’s ratio of cross-cutting to likeminded Facebook friends with other counties.

Cross-cutting connection rate based on human mobility data: a county’s ratio of cross-cutting to likeminded physical travel to other counties.

$$CrossCuttingConnectivityRate_{i} = \sum \limits _{j=0} ^{n-1} \frac{1 - (DemRatio_{i} DemRatio_{j} + GOPRatio_{i} GOPRatio_{j})}{DemRatio_{i} DemRatio_{j} + GOPRatio_{i} GOPRatio_{j}} * Connectivity_{ij}$$
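As a rough illustration of the formula above, the rate for county i can be computed by looping over all counties j. This is a minimal sketch, not the code used for the analysis: the function name, the toy vote shares, and the toy connectivity matrix are hypothetical, and the diagonal of the connectivity matrix is assumed to be zero so that a county's connection with itself contributes nothing.

```python
def cross_cutting_rate(i, dem, gop, connectivity):
    """Sum over j of (1 - same_ij) / same_ij * connectivity[i][j]."""
    total = 0.0
    for j in range(len(dem)):
        # Share of cross-county pairs expected to vote for the same party
        same = dem[i] * dem[j] + gop[i] * gop[j]
        total += (1 - same) / same * connectivity[i][j]
    return total


# Hypothetical toy data for three counties (not from the study)
dem = [0.6, 0.3, 0.5]
gop = [0.4, 0.7, 0.5]
connectivity = [
    [0.0, 0.2, 0.1],
    [0.2, 0.0, 0.3],
    [0.1, 0.3, 0.0],
]

rates = [cross_cutting_rate(i, dem, gop, connectivity) for i in range(len(dem))]
print(rates)
```

The same function applies to either dependent variable: only the connectivity matrix changes (Facebook friendships or physical travel).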

Independent Variables¶

Population¶

Based on data provided by the Census Bureau

Total population: Total population according to the U.S. census 2021 population estimate.

Birth rate: Birth rate in period 7/1/2020 to 6/30/2021.

Death rate: Death rate in period 7/1/2020 to 6/30/2021.

International migration rate: Net international migration rate (ratio of net international migration to population number) in period 7/1/2020 to 6/30/2021.

Domestic migration rate: Net domestic migration rate (ratio of net domestic migration to population number) in period 7/1/2020 to 6/30/2021.

Male proportion: The proportion of males in the general population according to the U.S. census 2021 population estimate.


Race & Ethnicity¶

Based on data provided by the Census Bureau

Mixed proportion: Proportion of two or more races population according to the U.S. census 2021 population estimate.

White and combined proportion: Proportion of white alone or in combination population according to the U.S. census 2021 population estimate.

Black and combined proportion: Proportion of black alone or in combination population according to the U.S. census 2021 population estimate.

Indigenous and combined proportion: Proportion of American Indian and Alaska Native alone or in combination population according to the U.S. census 2021 population estimate.

AANHPI proportion: Proportion of Asian American, Native Hawaiian, and Pacific Islander alone or in combination population according to the U.S. census 2021 population estimate.

Hispanic proportion: Proportion of Hispanic population according to the U.S. census 2021 population estimate.


Socio-Economics¶

Based on data provided by the Bureau of Economic Analysis

Education: A weighted index calculated from the percent of adults at each education level: Pre_highschool_rate × 1 + Highschool_rate × 2 + Some_college_rate × 3 + Bachelor_plus_rate × 4

  • Pre_highschool_rate (reference variable for education): Proportion of adults with less than a high school diploma, 2017-21
  • Highschool_rate: Proportion of adults with a high school diploma only, 2017-21
  • Some_college_rate: Proportion of adults completing some college or an Associate’s degree, 2017-21
  • Bachelor_plus_rate: Proportion of adults with a Bachelor’s degree or higher, 2017-21
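The education index can be sketched as a weighted sum of the four rates. This assumes the numbers 1 to 4 in the definition above are multiplicative weights on the corresponding rates; the function name and the example shares are hypothetical.

```python
def education_index(pre_highschool, highschool, some_college, bachelor_plus):
    """Weighted sum of education-level shares (weights 1 to 4, assumed)."""
    return (1 * pre_highschool
            + 2 * highschool
            + 3 * some_college
            + 4 * bachelor_plus)


# Hypothetical county: 10% below high school, 30% in each other level
example = education_index(0.10, 0.30, 0.30, 0.30)
print(example)
```

When the four shares sum to 1, the index ranges from 1 (all adults below high school) to 4 (all adults with a Bachelor's degree or higher).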

Median household income: Estimate of median household income, 2021

Unemployment rate 2021: Annual number of unemployed people in 2021 divided by the U.S. census 2021 population estimate.

Unemployment rate 3 years: Mean annual number of unemployed people in 2019-2021 divided by the U.S. census 2021 population estimate.

Unemployment rate 5 years: Mean annual number of unemployed people in 2017-2021 divided by the U.S. census 2021 population estimate.

Unemployment rate 10 years: Mean annual number of unemployed people in 2012-2021 divided by the U.S. census 2021 population estimate.

Per capita GDP 2021: Real 2021 per capita gross domestic product in thousands of chained (2012) dollars.

GDP change 2021: Real gross domestic product change between 2020 and 2021.

Per capita GDP 3 years: Average of real per capita 2019-2021 gross domestic product in thousands of chained (2012) dollars.

Poverty rate: Estimate of people of all ages in poverty 2021 divided by the U.S. census 2021 population estimate.


Partisanship¶

Based on 2020 election results data provided by the MIT Election Data and Science Lab

Distance-based local partisan difference: A county’s difference with nearby counties in 2020 Presidential Election results weighted based on the county’s distance to nearby counties. The weights are gaussian kernel weights with adaptive bandwidth (as a function of unit density).
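One way to read the distance-based measure is as a kernel-weighted average of a county's absolute difference in vote share from every other county. The sketch below makes several assumptions that the definition does not spell out: "difference" is taken as the absolute difference in Democratic vote share, the adaptive bandwidth is the distance to the k-th nearest county, and the function name and toy data are hypothetical.

```python
import math


def local_partisan_difference(i, coords, dem_share, k=2):
    """Gaussian-kernel weighted mean absolute difference in Democratic share."""
    # Distances from county i to every other county, nearest first
    dists = sorted(
        (math.dist(coords[i], coords[j]), j)
        for j in range(len(coords)) if j != i
    )
    # Adaptive bandwidth (assumed): distance to the k-th nearest county
    bandwidth = dists[k - 1][0]
    num = den = 0.0
    for d, j in dists:
        w = math.exp(-0.5 * (d / bandwidth) ** 2)  # Gaussian kernel weight
        num += w * abs(dem_share[i] - dem_share[j])
        den += w
    return num / den


# Hypothetical centroids and Democratic vote shares for four counties
coords = [(0, 0), (1, 0), (0, 1), (5, 5)]
dem_share = [0.5, 0.6, 0.4, 0.9]
diff_0 = local_partisan_difference(0, coords, dem_share, k=2)
print(diff_0)
```

Because the bandwidth shrinks where counties are dense and grows where they are sparse, distant counties (such as the fourth one here) receive almost no weight.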

Neighbor-based local partisan difference: A county’s difference with adjacent counties (sharing either border or vertex) in 2020 Presidential Election results.

Democrat-Republican ratio: A county’s Democratic-to-Republican vote share in the 2020 Presidential Election.


Local News¶

Based on the UNC News Desert project

Newspaper publication days: Number of days in a week that any local newspaper publishes.

Newspaper count: Number of local newspapers in a county.

Number of newspapers not owned by state conglomerates: Number of newspapers not owned by conglomerates defined as organizations that own newspapers in 2 or more states.

Number of newspapers not owned by county conglomerates: Number of newspapers not owned by conglomerates defined as organizations that own newspapers in 3 or more counties.

TV count: Number of public TV stations in a county.

Radio count: Number of public radio stations in a county.

Original content radio count: Number of public radio stations in a county that produce original content.

Non-original content radio count: Number of public radio stations in a county that do NOT produce original content.


Communities & Networks¶

Based on the Social Capital Atlas dataset.

Cross-class connectedness: Calculated based on Facebook friend network – two times the share of high-SES friends among low-SES individuals, averaged over all low-SES individuals in the county.

Cross-class exposure: Mean exposure to high-SES individuals by county for low-SES individuals – two times the average share of high-SES individuals in individuals’ groups, averaged over low-SES users.

Clustering: The average fraction of an individual’s friend pairs who are also friends with each other.

Support ratio: The proportion of within-county friendships where the pair of friends share a third mutual friend within the same county.

Volunteering rate: The percentage of Facebook users who are members of a group which is predicted to be about ‘volunteering’ or ‘activism’ based on group title and other group characteristics.

Civic organization density: The number of Facebook Pages predicted to be “Public Good” pages based on page title, category, and other page characteristics, per 1,000 users in the county.

Analyses

LASSO¶

Given the large number of candidate variables and geographically weighted regression's tendency to overfit, we first built LASSO models to filter out insignificant variables. We obtained a list of variables whose phi value is larger than .01 in the first model and .005 in the second and third models. After multicollinearity tests, we selected the following variables for the GWR model:

  • Total population
  • Domestic migration rate
  • International migration rate
  • AANHPI proportion
  • Mixed proportion
  • Hispanic proportion
  • Per capita GDP 3 years
  • Poverty rate
  • Democrat-Republican ratio
  • Distance-based local partisan difference
  • Newspaper publication days
  • Number of newspapers not owned by state conglomerates
  • Original content radio count
  • Cross-class friendship
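The filtering step can be illustrated with a small coordinate-descent LASSO followed by a threshold on the fitted coefficients. This is only a sketch under stated assumptions: it treats the cutoff as applying to the absolute value of each standardized coefficient, the penalty `alpha` and the toy data are hypothetical, and it does not reproduce the actual LASSO models or the multicollinearity tests.

```python
def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimise (1/2n)||y - Xb||^2 + alpha * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            z = 0.0
            for i in range(n):
                # Prediction from all features except j (partial residual)
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, alpha) / (z / n)
    return beta


def select_variables(names, beta, threshold):
    """Keep variables whose absolute coefficient exceeds the threshold."""
    return [name for name, b in zip(names, beta) if abs(b) > threshold]


# Hypothetical toy data: y depends on the first column only
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [2, -2, 2, -2]
names = ["informative", "noise"]

beta = lasso_coordinate_descent(X, y, alpha=0.1)
selected = select_variables(names, beta, threshold=0.01)
print(beta, selected)
```

The L1 penalty shrinks the coefficient of the uninformative column to exactly zero, so it falls below the cutoff and is dropped before the GWR step.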

Geographically Weighted Regression¶

Model 1: Cross-Cutting Physical Connection¶

===========================================================================
Model type                                                         Gaussian
Number of observations:                                                3112
Number of covariates:                                                    15

Global Regression Results
---------------------------------------------------------------------------
Residual sum of squares:                                           2356.620
Log-likelihood:                                                   -3983.111
AIC:                                                               7996.222
AICc:                                                              7998.398
BIC:                                                             -22552.615
R2:                                                                   0.243
Adj. R2:                                                              0.239

Variable                              Est.         SE  t(Est/SE)    p-value
------------------------------- ---------- ---------- ---------- ----------
X0                                  -0.000      0.016     -0.000      1.000
X1                                  -0.059      0.025     -2.331      0.020
X2                                   0.106      0.017      6.266      0.000
X3                                  -0.025      0.019     -1.356      0.175
X4                                  -0.024      0.016     -1.481      0.139
X5                                  -0.205      0.023     -8.922      0.000
X6                                   0.006      0.025      0.235      0.814
X7                                  -0.008      0.021     -0.361      0.718
X8                                  -0.026      0.017     -1.478      0.139
X9                                  -0.091      0.023     -4.017      0.000
X10                                  0.156      0.020      7.703      0.000
X11                                  0.359      0.019     18.832      0.000
X12                                 -0.076      0.022     -3.505      0.000
X13                                  0.156      0.024      6.510      0.000
X14                                  0.034      0.018      1.834      0.067

Geographically Weighted Regression (GWR) Results
---------------------------------------------------------------------------
Spatial kernel:                                           Adaptive bisquare
Bandwidth used:                                                     200.000

Diagnostic information
---------------------------------------------------------------------------
Residual sum of squares:                                            572.013
Effective number of parameters (trace(S)):                          526.031
Degree of freedom (n - trace(S)):                                  2585.969
Sigma estimate:                                                       0.470
Log-likelihood:                                                   -1780.093
AIC:                                                               4614.249
AICc:                                                              4829.645
BIC:                                                               7799.110
R2:                                                                   0.816
Adjusted R2:                                                          0.779
Adj. alpha (95%):                                                     0.001
Adj. critical t value (95%):                                          3.192

Summary Statistics For GWR Parameter Estimates
---------------------------------------------------------------------------
Variable                   Mean        STD        Min     Median        Max
-------------------- ---------- ---------- ---------- ---------- ----------
X0                        0.126      0.527     -1.463      0.160      1.425
X1                       -0.166      0.428     -2.500     -0.110      1.882
X2                        0.047      0.144     -0.316      0.030      0.639
X3                       -0.003      0.128     -0.752      0.000      0.437
X4                       -6.830     12.320    -54.989     -5.017     24.648
X5                       -0.193      0.250     -1.028     -0.205      0.566
X6                        0.042      0.534     -1.064     -0.066      2.979
X7                        0.148      0.328     -0.967      0.170      1.045
X8                        0.153      0.327     -0.791      0.142      1.466
X9                        0.073      0.177     -0.640      0.075      0.645
X10                       0.776      1.037     -0.475      0.439      5.555
X11                       0.139      0.143     -0.197      0.128      0.633
X12                      -0.038      0.111     -0.379     -0.025      0.392
X13                       0.196      0.222     -0.140      0.131      1.192
X14                      -0.039      0.115     -0.918     -0.029      0.685
===========================================================================
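The "Adaptive bisquare" kernel with a bandwidth of 200 reported above means that each local regression is fitted using roughly the 200 nearest counties, with weights that fall to zero at the edge of that neighborhood. The sketch below shows the weighting idea only. It is not the fitting library's exact implementation: the function name and toy coordinates are hypothetical, and the bandwidth is taken here as the distance to the k-th nearest other point.

```python
import math


def adaptive_bisquare_weights(i, coords, k):
    """Bisquare weights for point i, with bandwidth set by its k-th nearest neighbor."""
    dists = [math.dist(coords[i], c) for c in coords]
    # Index 0 of the sorted distances is the point itself (distance 0),
    # so index k is the k-th nearest other point (convention assumed here).
    bandwidth = sorted(dists)[k]
    return [
        (1 - (d / bandwidth) ** 2) ** 2 if d < bandwidth else 0.0
        for d in dists
    ]


# Hypothetical points on a line; with k=2 only the point itself and its
# nearest neighbor receive a non-zero weight
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
weights = adaptive_bisquare_weights(0, coords, k=2)
print(weights)
```

An adaptive bandwidth keeps the number of contributing counties roughly constant, which matters because county size and density vary widely across the U.S.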

Factors Associated w. Cross-Cutting Physical Connectivity Ratio¶
  • Positive
    • Domestic Migration Rate ($\beta$ = .106, p < .001)
    • Democrat to Republican Ratio ($\beta$ = .156, p < .001)
    • Distance-Based Local Partisan Difference ($\beta$ = .359, p < .001)
    • Newspaper Publication Days ($\beta$ = .156, p < .001)
  • Negative
    • Total Population ($\beta$ = -.059, p = .020)
    • Poverty Rate ($\beta$ = -.205, p < .001)
    • Cross-Class Friendship ($\beta$ = -.091, p < .001)
    • Number of Newspapers NOT Owned by State Conglomerates ($\beta$ = -.076, p < .001)
  • Insignificant
    • International Migration Rate ($\beta$ = -.025, p = .175)
    • 3-Year Average Per Capita GDP ($\beta$ = -.024, p = .139)
    • AANHPI Proportion ($\beta$ = .006, p = .814)
    • Mixed Proportion ($\beta$ = -.008, p = .718)
    • Hispanic Proportion ($\beta$ = -.026, p = .139)
    • Original Content Radio Count ($\beta$ = .034, p = .067)

Model 2: Cross-Cutting Facebook Connection (cross-cutting physical connection NOT included in IVs)¶

===========================================================================
Model type                                                         Gaussian
Number of observations:                                                3112
Number of covariates:                                                    15

Global Regression Results
---------------------------------------------------------------------------
Residual sum of squares:                                           1589.540
Log-likelihood:                                                   -3370.383
AIC:                                                               6770.766
AICc:                                                              6772.942
BIC:                                                             -23319.696
R2:                                                                   0.489
Adj. R2:                                                              0.487

Variable                              Est.         SE  t(Est/SE)    p-value
------------------------------- ---------- ---------- ---------- ----------
X0                                   0.000      0.013      0.000      1.000
X1                                  -0.014      0.021     -0.672      0.501
X2                                   0.042      0.014      3.052      0.002
X3                                  -0.006      0.015     -0.427      0.670
X4                                  -0.022      0.013     -1.712      0.087
X5                                  -0.149      0.019     -7.862      0.000
X6                                   0.111      0.020      5.523      0.000
X7                                  -0.027      0.017     -1.595      0.111
X8                                  -0.037      0.014     -2.634      0.008
X9                                  -0.128      0.019     -6.907      0.000
X10                                  0.410      0.017     24.709      0.000
X11                                  0.282      0.016     18.042      0.000
X12                                 -0.068      0.018     -3.823      0.000
X13                                  0.124      0.020      6.326      0.000
X14                                  0.091      0.015      5.985      0.000

Geographically Weighted Regression (GWR) Results
---------------------------------------------------------------------------
Spatial kernel:                                           Adaptive bisquare
Bandwidth used:                                                     200.000

Diagnostic information
---------------------------------------------------------------------------
Residual sum of squares:                                            253.669
Effective number of parameters (trace(S)):                          526.031
Degree of freedom (n - trace(S)):                                  2585.969
Sigma estimate:                                                       0.313
Log-likelihood:                                                    -514.862
AIC:                                                               2083.787
AICc:                                                              2299.183
BIC:                                                               5268.648
R2:                                                                   0.918
Adjusted R2:                                                          0.902
Adj. alpha (95%):                                                     0.001
Adj. critical t value (95%):                                          3.192

Summary Statistics For GWR Parameter Estimates
---------------------------------------------------------------------------
Variable                   Mean        STD        Min     Median        Max
-------------------- ---------- ---------- ---------- ---------- ----------
X0                        0.116      0.415     -1.221      0.141      1.049
X1                       -0.059      0.315     -1.219     -0.059      2.418
X2                        0.011      0.099     -0.331      0.009      0.324
X3                        0.002      0.094     -0.718      0.004      0.407
X4                       -2.483      7.869    -31.598     -2.461     27.299
X5                       -0.081      0.144     -0.430     -0.099      0.393
X6                        0.171      0.451     -0.567      0.051      2.961
X7                        0.126      0.282     -0.847      0.151      0.647
X8                        0.048      0.241     -0.749      0.047      1.158
X9                       -0.022      0.145     -0.723     -0.018      0.463
X10                       1.008      0.893     -0.200      0.773      4.112
X11                       0.093      0.109     -0.128      0.086      0.410
X12                      -0.018      0.077     -0.300     -0.012      0.273
X13                       0.083      0.127     -0.216      0.055      0.881
X14                       0.004      0.073     -1.048      0.010      0.336
===========================================================================

Factors Associated w. Cross-Cutting Facebook Connectivity Ratio¶
  • Positive
    • Domestic Migration Rate ($\beta$ = .042, p = .002)
    • AANHPI Proportion ($\beta$ = .111, p < .001)
    • Democrat to Republican Ratio ($\beta$ = .410, p < .001)
    • Distance-Based Local Partisan Difference ($\beta$ = .282, p < .001)
    • Newspaper Publication Days ($\beta$ = .124, p < .001)
    • Original Content Radio Count ($\beta$ = .091, p < .001)
  • Negative
    • Poverty Rate ($\beta$ = -.149, p < .001)
    • Hispanic Proportion ($\beta$ = -.037, p = .008)
    • Cross-Class Friendship ($\beta$ = -.128, p < .001)
    • Number of Newspapers NOT Owned by State Conglomerates ($\beta$ = -.068, p < .001)
  • Insignificant
    • Total Population ($\beta$ = -.014, p = .501)
    • International Migration Rate ($\beta$ = -.006, p = .670)
    • 3-Year Average Per Capita GDP ($\beta$ = -.022, p = .087)
    • Mixed Proportion ($\beta$ = -.027, p = .111)

Model 3: Cross-Cutting Facebook Connection (cross-cutting physical connection included in IVs)¶

===========================================================================
Model type                                                         Gaussian
Number of observations:                                                3112
Number of covariates:                                                    12

Global Regression Results
---------------------------------------------------------------------------
Residual sum of squares:                                            419.628
Log-likelihood:                                                   -1298.055
AIC:                                                               2620.110
AICc:                                                              2622.228
BIC:                                                             -24513.736
R2:                                                                   0.865
Adj. R2:                                                              0.865

Variable                              Est.         SE  t(Est/SE)    p-value
------------------------------- ---------- ---------- ---------- ----------
X0                                   0.000      0.007      0.000      1.000
X1                                   0.704      0.008     93.082      0.000
X2                                  -0.034      0.007     -4.808      0.000
X3                                  -0.008      0.010     -0.835      0.404
X4                                   0.099      0.008     13.160      0.000
X5                                  -0.014      0.007     -2.031      0.042
X6                                  -0.066      0.010     -6.942      0.000
X7                                   0.306      0.008     36.120      0.000
X8                                   0.030      0.008      3.609      0.000
X9                                  -0.006      0.009     -0.738      0.460
X10                                  0.028      0.009      3.070      0.002
X11                                  0.068      0.008      8.776      0.000

Geographically Weighted Regression (GWR) Results
---------------------------------------------------------------------------
Spatial kernel:                                           Adaptive bisquare
Bandwidth used:                                                     200.000

Diagnostic information
---------------------------------------------------------------------------
Residual sum of squares:                                            121.951
Effective number of parameters (trace(S)):                          432.233
Degree of freedom (n - trace(S)):                                  2679.767
Sigma estimate:                                                       0.213
Log-likelihood:                                                     624.776
AIC:                                                               -383.087
AICc:                                                              -242.579
BIC:                                                               2234.946
R2:                                                                   0.961
Adjusted R2:                                                          0.954
Adj. alpha (95%):                                                     0.001
Adj. critical t value (95%):                                          3.200

Summary Statistics For GWR Parameter Estimates
---------------------------------------------------------------------------
Variable                   Mean        STD        Min     Median        Max
-------------------- ---------- ---------- ---------- ---------- ----------
X0                        0.063      0.214     -0.476      0.042      0.933
X1                        0.508      0.152      0.034      0.523      0.895
X2                       -0.014      0.047     -0.187     -0.013      0.114
X3                        0.010      0.074     -0.221      0.012      0.199
X4                        0.140      0.155     -0.155      0.101      1.185
X5                       -0.001      0.115     -0.374      0.005      0.351
X6                       -0.053      0.071     -0.357     -0.045      0.127
X7                        0.673      0.558     -0.255      0.593      2.620
X8                        0.020      0.052     -0.161      0.016      0.214
X9                       -0.000      0.038     -0.111     -0.001      0.138
X10                      -0.014      0.064     -0.280     -0.008      0.237
X11                       0.021      0.057     -0.320      0.019      0.246
===========================================================================
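The "Adj. alpha (95%)" and "Adj. critical t value (95%)" lines in the three GWR outputs appear consistent with a multiple-testing correction that scales the nominal alpha by the ratio of covariates to the effective number of parameters. The sketch below reproduces that arithmetic from the reported values. The correction formula is an assumption inferred from the outputs, and the critical value uses a normal approximation to the t distribution, which is reasonable here because the degrees of freedom exceed 2,500.

```python
from statistics import NormalDist


def adjusted_alpha(alpha, n_covariates, effective_params):
    """Assumed correction: alpha * p / ENP."""
    return alpha * n_covariates / effective_params


def critical_value_normal_approx(adj_alpha):
    """Two-sided critical value using the standard normal distribution."""
    return NormalDist().inv_cdf(1 - adj_alpha / 2)


# (number of covariates, effective number of parameters) from the outputs above
models = {
    "Model 1": (15, 526.031),
    "Model 2": (15, 526.031),
    "Model 3": (12, 432.233),
}

results = {}
for name, (p, enp) in models.items():
    a = adjusted_alpha(0.05, p, enp)
    results[name] = (a, critical_value_normal_approx(a))
    print(name, round(a, 4), round(results[name][1], 2))
```

Under these assumptions the adjusted alpha rounds to 0.001 for all three models, and the approximate critical values land close to the reported 3.192 and 3.200. Local coefficients should be judged against this stricter threshold rather than the conventional 1.96.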

Factors Associated w. Cross-Cutting Facebook Connectivity Ratio¶
  • Positive
    • Cross-Cutting Physical Connectivity Ratio ($\beta$ = .704, p < .001)
    • AANHPI Proportion ($\beta$ = .099, p < .001)
    • Democrat to Republican Ratio ($\beta$ = .306, p < .001)
    • Distance-Based Local Partisan Difference ($\beta$ = .030, p < .001)
    • Newspaper Publication Days ($\beta$ = .028, p = .002)
    • Original Content Radio Count ($\beta$ = .068, p < .001)
  • Negative
    • Domestic Migration Rate ($\beta$ = -.034, p < .001)
    • Hispanic Proportion ($\beta$ = -.014, p = .042)
    • Cross-Class Friendship ($\beta$ = -.066, p < .001)
  • Insignificant
    • Poverty Rate ($\beta$ = -.008, p = .404)
    • Number of Newspapers NOT Owned by State Conglomerates ($\beta$ = -.006, p = .460)

Summary of Global Regression Models

  1. Population: Total population is negatively associated with cross-cutting physical connectivity ratio, but it is not significantly associated with cross-cutting connectivity on social media.
  2. Migration:
    • Domestic migration rate is positively associated with cross-cutting physical connectivity ratio. However, after holding cross-cutting physical connectivity constant, domestic migration is negatively associated with cross-cutting social media connectivity ratio.
    • International migration rate is not significantly associated with any type of cross-cutting connectivity ratio.
    • A potential explanation of the contradicting domestic migration coefficients in different models is as follows:
      • People who migrate domestically tend to maintain physical connections to their home county, and thus counties with a high domestic migration rate tend to have a higher cross-cutting physical connectivity ratio.
      • However, as well documented by political geography research, ideology is an important part of people's migration decisions. Social media connectivity reflects ideological factors more than physical connectivity does. Therefore, once physical connectivity is held constant, counties with a high domestic migration rate tend to have a lower cross-cutting social media connectivity ratio.
  3. Economy:
    • Per capita GDP: Per-capita GDP is not significantly associated with any type of cross-cutting connectivity ratio.
    • Poverty rate: Poverty rate is negatively associated with physical cross-cutting connectivity ratio, but after controlling for physical cross-cutting connectivity, poverty rate is not significantly associated with cross-cutting social media connectivity.
      • Places with more poverty are less developed and thus have fewer economic connections with other places, leading to a low cross-cutting physical connectivity ratio.
      • However, less developed places are not necessarily more ideological than more developed areas. In fact, Suk et al. (2020) find that people tend to have more cross-cutting connections when the economy is not going well. Therefore, after controlling for physical cross-cutting connectivity, poverty is no longer associated with cross-cutting connectivity on social media.
  4. Race & Ethnicity:
    • Proportion of mixed-race population is not significantly associated with any type of cross-cutting connectivity ratio.
    • Proportion of AANHPI population is not significantly associated with cross-cutting physical connectivity but positively associated with cross-cutting social media connectivity.
    • Hispanic proportion is not significantly associated with cross-cutting physical connectivity, but is negatively associated with cross-cutting social media connectivity.
  5. Cross-Class Connection: Cross-class friendship is negatively associated with both cross-cutting physical connectivity and cross-cutting social media connectivity. A potential explanation is that ideologies sometimes serve as a glue that connects people with different socio-economic statuses.
  6. Partisanship:
    • Democrat to Republican ratio is consistently positively associated with cross-cutting connectivity ratio, and the association is even stronger on social media. This is consistent with existing research showing that Democrats engage in more cross-cutting conversation, relationships, and media consumption than Republicans.
    • Local political difference (the difference between a county's partisan composition and that of its nearby counties) is consistently positively associated with cross-cutting connectivity ratio.
  7. Local News:
    • Number of newspapers NOT owned by state conglomerates is negatively associated with cross-cutting physical connectivity ratio. However, after controlling for cross-cutting physical connectivity, it is no longer associated with cross-cutting social media connectivity.
      • This may have something to do with conglomerates' choice of places to establish new chains. Such choices are usually influenced by practical conditions such as a place's market size.
    • Local newspaper publication days is consistently positively associated with cross-cutting connectivity ratio. Note that this variable considers both conglomerate-owned and non-conglomerate-owned newspapers. The directional difference between this variable and the variable above may be explained as a bridging effect of conglomerate-owned media, which is not necessarily beneficial.
    • The number of original-content public radio stations is positively associated with cross-cutting social media connectivity, but not significantly associated with cross-cutting physical connectivity. This may reflect the practical nature of physical connections. The bridging effect of public radio should also be noted.